Journal of the American Society for Mass Spectrometry
● American Chemical Society (ACS)
Preprints posted in the last 90 days, ranked by how well they match Journal of the American Society for Mass Spectrometry's content profile, based on 33 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Takeda, H.; Asakawa, D.; Takeuchi, M.; Tsugawa, H.
Sphingolipids are diverse lipids with sphingobases and N-acyl fatty acids as the hydrophobic moieties. While the importance of the in-depth elucidation of hydrophobic structures is widely recognized in lipid biology, mass spectrometry-based annotation of ceramides in the commonly used protonated form is often hindered by in-source dehydration during electrospray ionization in the heated state and variable water losses in the product ion spectrum. In this study, we investigated the sodium ion form and its product ions in ceramides with the use of electron-activated dissociation tandem mass spectrometry (EAD MS/MS) in addition to collision-induced dissociation to facilitate in-depth structural elucidation. While dehydrated ions from the protonated form were frequently observed, the sodium adduct ions remained stable because of their higher activation energy compared with the protonated form, which was validated using quantum chemical calculations. Using the three adduct forms under optimized conditions increased confidence in annotating the ceramide peaks through retention-time matching. Furthermore, EAD MS/MS of the sodium adduct ions facilitated the positional determination of double bonds and hydroxyl groups in the ceramide hydrophobic moiety. Our approach is showcased by the annotation of phytoceramides with N-acyl 2- and 3-hydroxyl groups in mouse feces and ceramides with N-acyl n-6 very long-chain polyunsaturated 2-hydroxy fatty acids in mouse testis.
Cyuzuzo, C. I.; Kruk, M.; Zhang, Q.; Ashareef, D.; Harmon, J.; Machida, Y. J.; VanKoten, H. W.; More, S. S.; Campbell, C.; Tretyakova, N. Y.
Oxidative DNA damage caused by endogenous reactive oxygen species (ROS) is a key driver of mutagenesis, cellular dysfunction, and aging, contributing to diseases like cancer, neurodegeneration, rheumatoid arthritis, cardiovascular disorders, and diabetes. Although more than 20 oxidative base lesions have been identified, ROS-induced DNA-protein crosslinks (DPCs) are poorly characterized. ROS-DPCs are unusually bulky and highly toxic lesions that accumulate in metabolically active tissues with age, but their identities, biological consequences, and repair in living cells have remained elusive. In the present work, we characterized ROS-DPCs in human fibrosarcoma (HT1080) cells treated with hydrogen peroxide (H2O2) and elucidated the mechanisms of their removal. Mass spectrometry-based proteomics identified over 100 cellular proteins that participated in DPC formation, most of which are involved in DNA metabolism. Our data further reveal that DNA replication and transcription facilitate DPC detection and identify a critical role of the ubiquitin-proteasomal system (UPS), replication-coupled activity of SPRTN metalloprotease, and nucleotide excision repair (NER) in removing ROS-induced DPCs. ROS-DPC formation was blocked by pretreatment with a metabolically stable, cell-permeable glutathione (GSH) analog (Ψ-GSH), suggesting a possible therapeutic strategy for preventing diseases associated with increased ROS levels.
Key points: Mass spectrometry-based proteomics identified over 100 proteins participating in DNA-protein cross-links in human cells treated with ROS. Our work reveals the mechanisms through which living cells recognize and remove ROS-DPCs. Our study demonstrates the potential of a glutathione analog to prevent ROS-DPC formation.
Shabbir, B.; Oliveira, P. B.; Fernandez-Lima, F.; Saeed, F.
A machine learning approach to molecular formula assignment is crucial for unlocking the full potential of ultra-high resolution mass spectrometry (UHRMS) when analyzing complex mixtures. By combining data-driven models with rigorous benchmarking, the accuracy, consistency, and speed in identifying plausible molecular formulas from vast spectral datasets can be improved. Compared with traditional de novo methods that rely heavily on rule-based heuristics and manual parameter tuning, machine learning approaches can capture complex patterns in data and adapt more readily to diverse sample types. In this paper, we describe the application of a machine learning method using the k-nearest neighbors (KNN) algorithm trained on curated chemical formula datasets from UHRMS analysis of dissolved organic matter (DOM) covering the saline river continuum and tropical wet/dry season variability. The influence of the mass accuracy (training set with 0.15-1 ppm) was evaluated on a blind test set of DOM samples of different geographical origins. A Decision Tree Regressor (DTR) and a Random Forest Regressor (RFR) based on mass accuracy (<1 ppm) were used. Our ML models annotated 43% more formulas than traditional methods (5,796 vs 4,047), and Model-Synthetic achieved a 99.9% assignment rate and assigned 2x more formulas (8,268 vs 4,047). DTR and RFR achieved formula-level accuracies (FA) of 86.5% and 60.4%, respectively. Overall, results show an increase in formula assignment when compared with traditional methods. This ultimately enables more reliable characterization of complex natural and engineered systems, supporting advances in fields such as environmental science, metabolomics, and petroleomics. Furthermore, the novel data set produced for this study is made publicly available, establishing an initial benchmark for molecular formula assignment in UHRMS using machine learning.
The dataset and code are publicly available at: https://github.com/pcdslab/dom-formula-assignment-using-ml
CCS Concepts: Computing methodologies → Machine learning → Learning paradigms → Supervised learning
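As a rough illustration of the mass-accuracy constraint mentioned in the abstract above, the sketch below assigns an observed neutral mass to the nearest candidate formula in a small reference list and accepts the match only within a ppm tolerance. It is not the authors' KNN, DTR, or RFR model: the candidate list, the function names, and the 1 ppm default are assumptions made for this example.

```python
from bisect import bisect_left

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}


def neutral_mass(formula):
    """Monoisotopic neutral mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def assign_formula(observed, candidates, tol_ppm=1.0):
    """Return (name, ppm error) of the nearest candidate within tol_ppm, else None."""
    library = sorted((neutral_mass(f), name) for name, f in candidates.items())
    masses = [m for m, _ in library]
    i = bisect_left(masses, observed)
    best = None
    # Only the two neighbours around the insertion point can be the nearest mass.
    for j in (i - 1, i):
        if 0 <= j < len(library):
            err = ppm_error(observed, library[j][0])
            if abs(err) <= tol_ppm and (best is None or abs(err) < abs(best[1])):
                best = (library[j][1], err)
    return best


# Hypothetical two-entry candidate list, for illustration only.
CANDIDATES = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C7H16O5": {"C": 7, "H": 16, "O": 5},
}
```

Calling `assign_formula(180.0634, CANDIDATES)` selects `C6H12O6`, while a mass several ppm away from every candidate returns `None`. A learned model would presumably use more features than mass alone; this sketch shows only the tolerance check.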
Li, K.; Liu, K.; Fulcher, J. M.; Tang, H.; Liu, X.
Mass spectral libraries have become essential resources for training deep learning (DL) models for spectral prediction and de novo sequencing in bottom-up mass spectrometry (BU-MS). Compared with BU-MS, top-down MS (TD-MS) offers unique advantages for characterizing intact proteoforms by analyzing proteoforms without enzymatic digestion. Despite these advantages, large-scale spectral libraries for TD-MS are currently lacking. Here we present TopRepo, the first comprehensive repository of TD-MS spectra, comprising more than 18 million spectra acquired from 12 species across eight types of mass spectrometers. Using TopRepo, we constructed a large-scale top-down spectral library containing over 5 million spectra with curated proteoform and fragment-ion annotations. We demonstrate that TopRepo enables pan-dataset analyses of N-terminal processing, mass shifts, and other proteoform characteristics identified by TD-MS. Furthermore, we show that the TopRepo spectral library substantially improves proteoform identification through spectral library searching and supports the training of DL models for high-accuracy top-down spectral prediction.
Coyle, E.; Lacombe-Rastoll, A.; Roux-Dalvai, F.; Leclercq, M.; Bories, P.; Berube, E.; Gotti, C.; Bekker-Jensen, D.; Bache, N.; Isabel, S.; Droit, A.
Background: Rapid and accurate identification of urinary tract infection (UTI) pathogens is critical for effective treatment and combating antimicrobial resistance. Conventional culture-based diagnostics are slow, and standard tandem mass spectrometry workflows are resource-intensive. Methods: We present a proof-of-concept workflow that integrates high-resolution data-independent acquisition (DIA) MS/MS on the Thermo Scientific Orbitrap Astral with MS1-only spectra from the Orbitrap Exploris 480. DIA data establish a reference panel of pathogen-specific peptides, which are then identified in MS1 spectra from urine samples. Machine learning models trained on these matched MS1 features were used to classify eight common uropathogens and non-infected controls across synthetic inoculations, pure cultures, and clinical patient samples. Results: The approach accurately distinguished bacterial species in both controlled inoculated samples and clinical patient samples, achieving a Matthews Correlation Coefficient (MCC) of 0.924 on held-out test data and 0.77 on patient samples. Conclusions: This proof-of-concept demonstrates that pairing DIA-derived peptide panels with MS1-only data acquired on a cost-effective instrument suitable for routine analysis enables rapid, culture-free identification of UTI pathogens. The method provides a scalable, high-throughput platform suitable for clinical applications and establishes a foundation for broader biomarker discovery and potential quantitative workflows.
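Because the results above are reported as a Matthews Correlation Coefficient over several classes, a minimal sketch of the multi-class form of the metric may help in reading those numbers. It computes the coefficient from a confusion matrix (rows are true classes, columns are predicted classes). The function name and the example matrices are assumptions; this is not the authors' evaluation code.

```python
from math import sqrt


def multiclass_mcc(confusion):
    """Multi-class MCC from a square confusion matrix (rows true, columns predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    true_counts = [sum(confusion[k]) for k in range(n)]  # row sums
    pred_counts = [sum(confusion[r][k] for r in range(n)) for k in range(n)]  # column sums
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    # By convention the score is 0 when the denominator vanishes.
    return numerator / denominator if denominator else 0.0
```

For two classes this reduces to the familiar binary MCC, and a perfectly diagonal matrix gives 1.0.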
Zhang, G.; Vincent, E. C.; Disselkoen, S. M.; Dodds, J. N.; DuVal-Smith, Q.; Patan, A.; Mohanty, I.; Deleray, V.; Zhang, J.; Thiessen, P. A.; Bolton, E. E.; Schymanski, E. L.; Dorrestein, P. C.; Theriot, C. M.; Baker, E. S.
Microbes and bile acids are tightly intertwined, especially in the gut. While the liver produces primary bile acids from cholesterol, gut bacteria transform these into diverse secondary forms which act as powerful signaling molecules, influencing host metabolism and immune function. Since bile acid changes are increasingly linked to health and disease, their accurate measurement in the gut and circulation is essential. Analytical evaluations, however, remain challenging as many bile acids co-elute in liquid chromatography (LC), share identical precursor masses in mass spectrometry (MS), and produce similar tandem mass spectrometry (MS/MS) spectra. As a result, conventional LC-MS/MS workflows struggle to differentiate bile acids, motivating the addition of orthogonal separations such as ion mobility spectrometry (IMS). Here, we assess optimal bile acid extraction parameters for stool, serum, and plasma; compare LC conditions; and assess electrospray ionization performance across polarities. Additionally, we created a multidimensional reference library containing LC retention times, IMS collision cross section values, and accurate precursor masses for 280 unique bile acids (264 endogenous and 16 deuterium-labeled species) including unconjugated, host-conjugated, and microbially conjugated bile acids. This multidimensional library empowers bile acid identification in complex samples and enables a more comprehensive exploration of their biological roles and disease associations.
Brook, J. R.; Tong, X.; Wong, A. Y.; Weitman, M.; Boire, A.; Kanarek, N.; Petrova, B.
Introduction: Retinoids are bioactive vitamin A derivatives that regulate cellular differentiation and gene expression, yet their reliable quantification remains challenging due to low abundance, structural isomerism, and sensitivity to ionization conditions while handling. Objectives: In this study, we performed a systematic optimization of liquid chromatography-mass spectrometry (LC-MS)-based detection of retinoids across tissues and biofluids. Methods: Chromatographic separation, adduct formation, ionization parameters, fragmentation behavior, and extraction procedures were evaluated in an integrated workflow. Results: Chromatographic conditions influenced not only retention time but also the ionic species detected, affecting precursor selection for MS² analysis. Retinoids exhibited compound-dependent responses to electrospray ionization and collision energy, requiring tailored acquisition parameters. Extraction experiments demonstrated differential recovery among retinoid classes and revealed matrix-dependent behavior, indicating that protocols used for tissues cannot be directly transferred to low-abundance biofluids. Using optimized conditions, retinoids were detected in mouse cerebrospinal fluid (CSF) at concentrations approaching the analytical detection limit, where MS² confirmation was necessary for reliable identification. Conclusion: Together, our results provide a framework for reproducible retinoid profiling across biological matrices and enable comparative studies of retinoid biology in low-volume and low-abundance biofluids.
Okuda, Y.; Konno, R.; Taguchi, T.; Itakura, M.; Matsui, T.; Miyatsuka, T.; Ohara, O.; Kawashima, Y.; Kodera, Y.
Plasma contains diverse bioactive peptides that play crucial roles in maintaining homeostasis and regulating disease responses. However, the presence of peptides derived from high-abundance proteins such as albumin makes comprehensive analysis of native peptides secreted by organs challenging. This study aimed to establish a highly sensitive plasma peptidomic approach by combining data-independent acquisition (DIA) with spectral libraries of plasma and organs. First, peptides were extracted from plasma and eleven organ types using a high-yield peptide extraction method, the differential solubilization method. These peptides were then measured via data-dependent acquisition (DDA) analysis using a timsTOF HT to construct an empirical spectral library. Subsequently, DIA-MS data from plasma samples were acquired and analyzed using this spectral library. This strategy identified, on average, over 5,500 peptides per run, with over 2,000 organ-derived peptides including 19 known bioactive peptides. The novel strategy proposed here enables highly sensitive quantitative analysis of organ-derived peptides in plasma, linking them to their secreting organs. It is expected to substantially contribute not only to the discovery of biomarkers and novel bioactive peptides but also to elucidating the pathophysiology of systemic diseases.
Buur, L. M.; Winkler, S.; Dorfer, V.
Open modification search (OMS) strategies have gained popularity in mass spectrometry-based proteomics for identification of peptides carrying unknown or unexpected post-translational modifications. However, most OMS search engines report only the overall mass difference between the precursor and the matched peptide and do not explicitly identify or score combinations of multiple modifications at the peptide-spectrum match (PSM) level, leaving the interpretation of mass shifts to the end user and to downstream analysis tools. Here, we introduce MS Andrea, a novel OMS search engine developed to directly identify and score combinations of up to four variable modifications per peptide without having to predefine them. MS Andrea uses a sequence tag-based strategy to efficiently filter candidate peptides prior to scoring. Remaining candidates are evaluated using the MS Amanda scoring function, first considering fixed modifications only, followed by a second scoring stage in which combinations of modifications from the Unimod database are considered based on the observed mass difference and matched to the spectrum. We evaluated MS Andrea using phosphopeptide datasets from HeLa cells and Arabidopsis thaliana and compared its performance with the widely used OMS engines MSFragger and Sage. Across datasets, MS Andrea identified the highest number of PSMs at 1% false discovery rate while achieving comparable peptide-level identifications. Importantly, MS Andrea directly reports modification identities and sites at the PSM level and enables the identification of peptides having up to four variable modifications. Together, these results demonstrate that MS Andrea facilitates more detailed and interpretable characterization of peptide modifications while maintaining competitive identification performance in OMS-based proteomic analyses.
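The idea of explaining an observed mass difference by a combination of up to four modifications can be sketched as a small enumeration. The example below is not MS Andrea's algorithm or scoring: it simply lists the combinations whose summed monoisotopic masses fall within a tolerance of the mass shift. The five-entry modification table, the function name, and the 0.005 Da tolerance are assumptions chosen for illustration.

```python
from itertools import combinations_with_replacement

# Small illustrative subset of modification monoisotopic masses (Da).
UNIMOD_SUBSET = {
    "Acetyl": 42.010565,
    "Carbamidomethyl": 57.021464,
    "Methyl": 14.015650,
    "Oxidation": 15.994915,
    "Phospho": 79.966331,
}


def explain_mass_shift(delta, mods=UNIMOD_SUBSET, max_mods=4, tol_da=0.005):
    """List combinations of up to max_mods modifications whose masses sum to delta."""
    names = sorted(mods)
    hits = []
    for k in range(1, max_mods + 1):
        # Combinations with replacement allow the same modification to occur twice.
        for combo in combinations_with_replacement(names, k):
            if abs(sum(mods[m] for m in combo) - delta) <= tol_da:
                hits.append(combo)
    return hits
```

With this table, a shift of 95.961246 Da is explained only by one oxidation plus one phosphorylation. A full search engine would additionally place each modification on a residue and score the result against the spectrum, which this sketch does not attempt.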
Guffick, C.; Rincon Pabon, J. P.; Griffiths, D.; Inaba-Inoue, S.; Beis, K.; Politis, A.
The structural study of membrane proteins has traditionally relied on detergent-based extraction from cellular membranes. Although native-like reconstitution approaches have advanced, fully understanding membrane protein dynamics requires examining them within their native membrane environment. Hydrogen-deuterium exchange mass spectrometry (HDX-MS) is a powerful method for probing structural dynamics in reconstituted systems, but the presence of the lipid bilayer introduces considerable complexity, limiting broader adoption under physiological conditions. Here, we present the first fully automated HDX-MS platform incorporating a two-stage delipidation workflow. We applied this approach to monitor the dynamics of the ABC transporter MsbA in isolated inner membrane vesicles (IIMVs) from Escherichia coli through its ATPase cycle. IIMVs revealed distinct dynamic features within the nucleotide binding domains and substrate binding cavity, highlighting physiologically relevant motions not observed with detergent solubilised MsbA. This platform significantly advances HDX-MS and underscores the importance of studying membrane proteins in native lipid environments.
Shen, J.; Polasky, D. A.; Jager, S.; Yu, F.; Heck, A. J. R.; Reiding, K. R.; Nesvizhskii, A. I.
Glycosylation is one of the most important, but also most complex, post-translational modifications of proteins, playing a pivotal role in various pathological processes. Mass spectrometry-based large-scale glycoproteomics analysis offers a powerful approach to explore the fundamental roles of glycosylation in both physiological and pathological contexts. Traditionally, DDA glycopeptide assignment relies on information-dense MS2 spectra, containing sufficient fragmentation information to identify both the peptide and glycan moieties. Achieving this fragmentation can be difficult, especially for low-abundance glycopeptides and/or large, complex glycans. These glycopeptides are often not assigned using current data analysis software, yet they can be of biological relevance. Here, we introduce a method called match-between-glycans (MBG), which expands glycopeptide identification while maintaining the existing glycoproteome analysis workflow. MBG expands the set of identified glycopeptides to include those without MS2 spectra, or with lower quality MS2 spectra, by looking for MS1 signals displaced from other identified glycopeptides by one or multiple monosaccharide unit(s). MBG can also identify glycans not included in the glycan database, such as those containing adducts or modifications, allowing these glycans to be recovered without a drastic expansion of the search space. Combined with target-decoy FDR control, we show this method is capable of accurately expanding glycopeptide identifications and providing a more complete quantitative profile of glycosylation at each glycosite. MBG is fully integrated into the glycoproteomics workflows in FragPipe, allowing seamless, one-click operation.
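The core idea described above, looking for MS1 signals displaced from an identified glycopeptide by a monosaccharide mass, can be sketched in a few lines. This is a simplified illustration and not the FragPipe implementation: it considers only single-unit shifts on neutral masses and ignores retention time, charge state, and FDR control. The function name, the four-entry residue table, and the 10 ppm tolerance are assumptions.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
MONOSACCHARIDES = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}


def match_between_glycans(anchor_mass, feature_masses, tol_ppm=10.0):
    """Find MS1 features one monosaccharide heavier or lighter than an anchor.

    anchor_mass: neutral mass of an already identified glycopeptide.
    feature_masses: neutral masses of unidentified MS1 features.
    Returns (feature mass, monosaccharide, +1 or -1) for each match.
    """
    matches = []
    for mass in feature_masses:
        for name, unit in MONOSACCHARIDES.items():
            for sign in (+1, -1):
                expected = anchor_mass + sign * unit
                if abs(mass - expected) / expected * 1e6 <= tol_ppm:
                    matches.append((mass, name, sign))
    return matches
```

For an anchor at 2000.0 Da, a feature at 2162.0528 Da is reported as the anchor plus one hexose, and a feature at 1796.9206 Da as the anchor minus one HexNAc.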
Plekhova, V.; Van de Velde, N.; VandenBerghe, A.; Diana Di Mavungu, J.; Vanhaecke, L.
Ambient metabolomics techniques such as laser-assisted rapid evaporative ionization mass spectrometry (LA-REIMS) enable fast, preparation-free fingerprinting of biological samples but are inherently limited by spectral congestion in the absence of chromatographic separation. While ion mobility spectrometry provides additional gas-phase separation, maintaining ion transmission under the transient signals characteristic of laser desorption remains analytically challenging. Here, we define operating conditions for cyclic traveling-wave ion mobility spectrometry (cIMS) that preserve transmission under LA-REIMS duty-cycle constraints and systematically evaluate how cIMS integration reshapes biofluid fingerprints and enhances chemical specificity in chromatography-free metabolomics analysis. Under optimized single-pass conditions, cIMS separation reorganized LA-REIMS spectra into structured mass/mobility feature domains, enabling selective mobility-based filtering of matrix-derived salt cluster ions. This reduced non-biological background contributions by up to 35% of total spectral intensity while preserving over 90% of detected untargeted features. Although cIMS operation introduced a sensitivity penalty relative to time-of-flight-only acquisition, approximately 80% of the total ion current was recovered under optimized conditions. Mobility-resolved data revealed coherent homologous series and class-specific structural trends, particularly for lipids, supporting class-level annotation. Analysis of 101 metabolite and lipid standards covering a broad physicochemical range (logP -5.30 to 19.40) demonstrated comprehensive molecular coverage, high mass accuracy (mean 2.4 ppm), and good agreement with reference CCS values (mean deviation 4.0%), with isomer separation observed for biologically important secondary bile acids in extended separation cycles.
Collectively, these results establish LA-REIMS-cIMS as a practical analytical strategy for enhancing chemical specificity and spectral interpretability in support of high-throughput large-scale metabolic fingerprinting.
Graphical abstract: Ion mobility spectrometry adds an orthogonal gas-phase separation to LA-REIMS, reorganizing complex biofluid spectra into distinct mass-mobility feature bands and improving molecular resolution in rapid ambient ionization metabolomics.
Zelter, A.; Riffle, M.; Merrihew, G. E.; Mutawe, B.; Maurais, A.; Inman, J. L.; Celniker, S. E.; Mao, J.-H.; Wan, K. H.; Snijders, A. M.; Wu, C. C.; MacCoss, M. J.
Dogma suggests protein quantification is a prerequisite for LC-MS/MS-based proteomics studies. Such quantification allows a standardized ratio of sample to digestion enzyme and enables physical normalization of protein digest loaded onto the mass spectrometer for analysis. Most proteomics studies include these steps. However, there are significant costs in time, money, and experimental complexity associated with performing protein quantification and physical normalization for every sample, especially for larger studies. Proteomics data analysis pipelines typically include computational normalization strategies to compensate for unavoidable systematic biases. These strategies also have the potential to compensate for avoidable variation, such as that introduced by omitting sample amount normalization. Here we investigate the effects of either physically normalizing the amount of protein for each individual sample or leaving it unnormalized. Our results show the relationship between increased protein amount variation in sample input and the variance of quantified relative abundances of peptides and proteins output after data analysis. The experiments presented here suggest that protein quantification and physical normalization steps can be omitted from some quantitative proteomic experiments without incurring an unacceptable increase in measurement variability after computational normalization has been applied. This work will enable important time and cost saving optimizations to be made to many proteomics workflows.
Stewart, H.; Shuken, S. R.; Rathje, C.; Kraegenbring, J.; Zeller, M.; Arrey, T. N.; Hagedorn, B.; Denisov, E.; Ostermann, R.; Grinfeld, D.; Petzoldt, J.; Mourad, D.; Cochems, P.; Bonn, F.; Delanghe, B.; Wiedemeyer, M.; Wagner, A.; Bomgarden, R.; Frost, D. C.; Zuniga, N. R.; Rad, R.; Paulo, J. A.; Damoc, E.; Makarov, A.; Zabrouskov, V.; Hock, C.; Gygi, S. P.
Tandem mass tags (TMT) allow highly multiplexed and thus high-throughput, precisely quantitative proteomic analysis. Incorporation of additional deuterated reporter channels has nearly doubled the multiplexing achieved with Thermo Scientific TMTpro reagents from 18 to 35-plex but requires extremely high (~100k) analyzer resolving power at m/z 128 to differentiate and quantify reporter ion channels, far beyond any single-reflection time-of-flight analyzer, and exceeding the multi-reflection Thermo Scientific Astral analyzer in its standard operation. A multi-pass mode of Astral operation has been developed for the Thermo Scientific Orbitrap Astral Zoom mass spectrometer that triples the ion path to 90 m, more than doubling resolving power for a narrow m/z range. This "TMT HR mode" has been integrated into a new method of TMT proteomic analysis that splits regular MS2 analysis of labeled peptides into paired measurements comprising wide mass range scans for peptide identification, and TMT HR mode scans for reporter ion quantification. The method has been shown to accurately quantify 32-plex labeled HeLa protein lysate and provide far greater depth of analysis than state-of-the-art Orbitrap-only methods, while analysis of 11-plex labeled yeast showed no analytical depth sacrificed vs regular Orbitrap Astral TMT analysis. Further comparative measurements of a 2-cell line 35-plex sample demonstrated greater analytical depth than, and similar quantitative precision to, "gold standard" Orbitrap MS3 methods.
Krieger, C.; Everton, Z.; You, Y.; Lewis, B.; Bank, T.; Burnet, M. C.; Williams, S.; Walukiewicz, H.; Rao, C.; Wolfe, A.; Payne, S. H.; Nakayasu, E. S.
Evolutionary conservation has been considered a hallmark of essential basic functions in cells. Therefore, the study of evolutionarily conserved post-translational modifications (PTMs) can provide insight into their role in protein function. In this context, mass spectrometry can identify and quantify thousands of PTM sites. However, a major bottleneck lies in analyzing the large amounts of data collected by the mass spectrometer. Here we address the need for a protein sequence alignment tool for multiple PTMs across several species. We developed a tool named PTMOverlay that takes peptide identification output files and overlays PTM sites onto multiple protein sequence alignments. Examining 31 bacterial isolates, we combined their protein sequences with select PTM types, including acetylation, phosphorylation, monomethylation, dimethylation, and trimethylation. The tool revealed a variety of conserved modification sites in bacterial central carbon metabolism. Further structural analysis revealed possible interactions between methylated arginine and lysine residues and phosphothreonine/serine sites on the homodimer interface of enolase. Overall, this tool can parse large amounts of mass spectrometry data and allows for more informed and efficient selection of sites for future studies of protein function.
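Overlaying PTM sites onto a multiple sequence alignment comes down to converting each residue position in an ungapped protein sequence into an alignment column, then grouping sites that share a column. The sketch below illustrates that bookkeeping only. It is not the PTMOverlay code, and the function names, the input layout, and the toy alignment in the usage note are assumptions.

```python
from collections import defaultdict


def residue_to_column(aligned_seq, position):
    """Map a 1-based residue position to its 1-based column in a gapped sequence."""
    seen = 0
    for column, char in enumerate(aligned_seq, start=1):
        if char != "-":  # gaps do not count as residues
            seen += 1
            if seen == position:
                return column
    raise ValueError("position exceeds sequence length")


def overlay_ptms(alignment, ptm_sites):
    """Group PTM sites by (alignment column, PTM type).

    alignment: {species: gapped aligned sequence}
    ptm_sites: {species: [(1-based residue position, PTM name), ...]}
    Returns {(column, PTM name): set of species carrying that PTM there}.
    """
    columns = defaultdict(set)
    for species, sites in ptm_sites.items():
        for position, ptm in sites:
            column = residue_to_column(alignment[species], position)
            columns[(column, ptm)].add(species)
    return dict(columns)
```

With the alignment `{"sp1": "MK-TAY", "sp2": "MKQTAY"}`, a phosphosite at residue 3 of `sp1` and one at residue 4 of `sp2` both land in column 4, so the site would be counted as shared by the two species.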
Cologna, S. M.; Pathmasiri, K. C.
Niemann-Pick Disease Type C1 (NPC1) is a fatal neurodegenerative disorder characterized by lysosomal lipid accumulation and dysmyelination. Previous studies have documented some lipid abnormalities in the null mouse (Npc1-/-), focusing on the whole brain and liver. However, the specific lipidomic alterations in severely affected brain regions, such as the cerebellum, and in isolated myelin remain understudied. We present a comprehensive LC-MS-based lipidomic analysis of the cerebellum and cortex of Npc1-/- mice during disease progression stages, along with the first comprehensive characterization of the myelin lipidome in NPC1 disease. Our results reveal that the cerebellum progressively accumulates lipid species, including sphingolipids and glycerophospholipids, while the cortex shows an overall decline in lipid levels, indicating region-specific lipid dysregulation. Notably, bis(monoacylglycero)phosphates and their precursors, including lysophosphatidylglycerol and hemibismonoacylglycerophosphate, exhibit significant accumulation, with a preference for docosahexaenoic acid (DHA)-containing species. Despite known cholesterol storage defects in NPC1, we observed reduced free cholesterol levels in both regions, which we attribute to myelin loss. Myelin-specific lipidomics demonstrated extensive dysregulation, particularly in cortical myelin, including severe losses in sulfatides, ether-lipids, and acylcarnitine, alongside striking accumulation of hydroxy-ceramides. These findings identify novel lipid alterations in brain subregions and myelin, offering critical insight into the lipid perturbations under the loss of NPC1, and highlight lipid targets that may be crucial for therapeutic intervention and biomarker development.
Schramm, T.; Gillet, L.; Reber, V.; de Souza, N.; Gstaiger, M.; Picotti, P.
Peptide-level analyses are becoming increasingly popular in mass spectrometry-based proteomics and are being applied, for example, in immunopeptidomics, structural proteomics, and analyses of post-translational modifications. In such analyses, peptides that are not biologically meaningful but instead arise as artifacts prior to mass spectrometry analysis pose the risk of data misinterpretation. Here, we describe an approach based on retention time analysis and precise chromatographic peak matching to identify peptides generated by in-source fragmentation (ISF), which occurs between chromatographic separation of peptide mixtures and the first mass filter of a tandem mass spectrometer (MS). To understand the prevalence and properties of ISF, we generated 13 proteomics datasets and analyzed them along with 25 additional previously published datasets spanning a broad range of sample types, MS, and proteomics approaches including classical bottom-up proteomics, immunopeptidomics, structural proteomics, and phosphoproteomics. We found that, in typical trypsin-digested samples, on average 1% of fully-tryptic peptides and 22% of semi-tryptic peptides originated from ISF. However, we observed large variations between datasets, and in-source fragments exceeded, in some cases, a third of the total peptide identifications. The extent of ISF was dependent on the peptide sequence, the instrument, method parameters, and sample complexity. Although ISF did not impair relative quantification across samples, it generated peptides that could be misinterpreted qualitatively, inflated peptide identifications, and comprised up to 37% of peptides shorter than 9 amino acids in immunopeptidomics datasets. We propose that, for peptide-centric applications, our open-source ISF detection approach be used to re-annotate peptides generated by ISF and remove them to avoid misinterpretation of data.
ISF is an increasing concern with improving mass spectrometers, as they enable detection of an ever-increasing number of m/z features, including low abundance features like ISF products. Our work thus addresses a growing issue in proteomics and presents solutions to mitigate the impact of in-source fragment peptides. In the future, improved feature detection algorithms may enable elucidation of new ISF patterns affecting side chains that have been missed so far, which could contribute to explaining the vast space of as-yet unannotated proteomics data.
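Since in-source fragmentation happens after chromatographic separation, a fragment elutes at the same time as its intact precursor, which is what makes retention-time matching informative. The sketch below captures only that intuition and is not the authors' open-source tool: it flags a peptide as an ISF candidate when it is a proper prefix or suffix of a longer identified peptide and the two apex retention times agree within a tolerance. The function name, the prefix/suffix rule, and the 0.05-minute tolerance are assumptions.

```python
def flag_isf_candidates(peptides, rt_tol=0.05):
    """Flag peptides that co-elute with a longer peptide containing them at a terminus.

    peptides: list of (sequence, apex retention time in minutes).
    Returns a list of (candidate fragment, putative parent) sequence pairs.
    """
    flagged = []
    for seq, rt in peptides:
        for parent_seq, parent_rt in peptides:
            # A fragment must be shorter and share the parent's N- or C-terminus.
            is_fragment = len(seq) < len(parent_seq) and (
                parent_seq.startswith(seq) or parent_seq.endswith(seq)
            )
            if is_fragment and abs(rt - parent_rt) <= rt_tol:
                flagged.append((seq, parent_seq))
                break  # one co-eluting parent is enough to flag the peptide
    return flagged
```

A truncated peptide that elutes well away from the longer peptide is left unflagged, consistent with it being a genuine species in the sample and not an in-source product. Matching full chromatographic peak shapes, as the abstract describes, would be stricter than this apex-only comparison.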
Dunlop, F. M.; Mason, S.; Hafizi Rastabi, N.; Alexander, S. E.; Robatjazi, S.; Davis, J.; Laird, C.; Kang, T.; Mathivanan, S. E.; Russell, A. P.
Extracellular vesicles (EVs) are promising biomarkers, yet their proteomic analysis from plasma is hampered by low abundance and co-purification of contaminants (e.g., lipoproteins, platelets) and technical variability, particularly in small-volume animal models. We developed and validated a modular protocol integrating Size Exclusion Chromatography (SEC) with Strong Anion Exchange (SEC-SAX) specifically tailored for quantitative LC-MS proteomics from small starting volumes (150 µL of plasma). SEC alone successfully removed 99% of Albumin, and the SAX step significantly enriched EVs over contaminating lipoproteins. Downstream single pot solid phase enhanced (SP3) sample prep and STAGE tip solid phase extraction ensured maximum proteome depth. Critical confounding factors were objectively assessed: Platelet Factor 4 (PF4) was confirmed as a highly sensitive platelet marker, confirming the necessity of meticulous plasma preparation. Sample hemolysis impacted the plasma EV proteome data. As such, an objective measure (nanodrop spectrophotometer) of hemolysis and exclusion of hemolysed samples (heme >0.3 mg/ml) is recommended. The protocol is applicable to both human and mouse plasma as demonstrated by EV enrichment and quantification of biomarker proteins associated with neurodegenerative diseases from eight individual mouse plasma samples.
Manuscript Highlights:
- Developmental workflow for a quantitative SEC-SAX protocol for EV proteomics from small plasma volumes (150 µL).
- A range of variables tested including SAX beads amount, digestion buffer, digestion time, STAGE tip solid phase extraction, SAX elution buffer and sample filtration.
- The SAX step significantly enhances EV proteome depth by increasing EV purity in relation to ApoB lipoproteins.
- Shows the impact of the major confounding factors of sample hemolysis and platelet contamination on the EV proteome.
- Platelet contamination increases the number and abundance of proteins detected including known disease biomarkers and sample hemolysis is associated with proteins derived from platelet and red blood cell derived EVs.
- Platelet Factor 4 (PF4) is identified and confirmed as a sensitive marker for platelet contamination.
- Applicable to both human and mouse plasma.
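The hemolysis exclusion rule recommended in the abstract above (heme >0.3 mg/ml) can be sketched as a simple sample filter. This is only an illustration: the function name, the sample identifiers, and the heme readings are hypothetical, and only the 0.3 mg/ml cutoff comes from the abstract.

```python
# Cutoff stated in the abstract: samples with heme > 0.3 mg/ml are excluded.
HEME_CUTOFF_MG_PER_ML = 0.3


def split_by_hemolysis(heme_by_sample, cutoff=HEME_CUTOFF_MG_PER_ML):
    """Split samples into (kept, excluded) dicts by heme concentration.

    heme_by_sample maps a sample ID to its heme reading in mg/ml.
    A sample is excluded only when its reading is strictly above the cutoff.
    """
    kept = {s: h for s, h in heme_by_sample.items() if h <= cutoff}
    excluded = {s: h for s, h in heme_by_sample.items() if h > cutoff}
    return kept, excluded


# Hypothetical readings, for illustration only.
readings = {"mouse_1": 0.12, "mouse_2": 0.45, "mouse_3": 0.30}
kept, excluded = split_by_hemolysis(readings)
print("kept:", sorted(kept))
print("excluded:", sorted(excluded))
```

A reading exactly at the cutoff is kept here because the abstract states the exclusion criterion as strictly greater than 0.3 mg/ml.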
Bekbergenova, M.; Jiang, T.; Nothias, L.-F.; Bittremieux, W.
Motivation: The quality of tandem mass spectra critically determines metabolite identifiability in untargeted metabolomics, yet optimizing MS2 acquisition parameters experimentally is costly, time-consuming, and infeasible across the full diversity of samples and instruments. A key obstacle to computational quality assessment is the absence of reliable negative labels: most MS2 spectra remain unannotated not because they are low quality, but because the corresponding compounds are absent from reference libraries. This label ambiguity fundamentally limits supervised learning approaches and complicates acquisition-time decision making. Results: We present a deep learning framework that predicts the probability that an MS2 scan will be identifiable using only the preceding MS1 spectrum and instrument acquisition parameters, without inspecting the MS2 spectrum itself. The problem is formulated as positive-unlabeled learning and addressed using a non-negative positive-unlabeled objective, enabling robust training despite missing negative labels. Trained on over eight million MS2 scans from public Orbitrap metabolomics datasets and evaluated on laboratory-disjoint benchmarks, the model recovered 90% of known identifiable spectra in a held-out test set. Predicted probabilities generalized to unseen laboratories and stratified unlabeled spectra in a physicochemically meaningful manner. Independent validation demonstrated that high predicted quality is associated with increased structural explainability, richer fragmentation patterns, canonical precursor charge states, and reduced spectral interference. These results indicate that MS2 identifiability can be anticipated from precursor context and acquisition settings alone. Availability and implementation: Code is available at https://github.com/bittremieuxlab/pu_ms2_identifiability. Model weights and processed datasets are available at 10.5281/zenodo.18266932.
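For readers unfamiliar with the non-negative positive-unlabeled objective mentioned in the abstract above, the following is a minimal sketch of the standard non-negative PU risk estimator with a sigmoid surrogate loss. It is not taken from the authors' repository: the function names, the choice of surrogate loss, the class-prior value, and the example scores are all assumptions made for illustration.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss: sigmoid(-label * score), label in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean_loss(scores, label):
    """Average surrogate loss of a batch of scores against one label."""
    return sum(sigmoid_loss(s, label) for s in scores) / len(scores)


def nnpu_risk(scores_pos, scores_unl, prior):
    """Non-negative PU risk for classifier scores.

    scores_pos: scores of labeled-positive examples
    scores_unl: scores of unlabeled examples
    prior:      assumed fraction of positives among unlabeled data
                (must be supplied or estimated separately)
    """
    risk_pos_as_pos = mean_loss(scores_pos, +1)
    risk_pos_as_neg = mean_loss(scores_pos, -1)
    risk_unl_as_neg = mean_loss(scores_unl, -1)
    # Estimated risk on the (unobserved) negatives; it can go below zero
    # with a flexible model, so it is clamped at zero.
    neg_risk = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, neg_risk)


# Hypothetical scores: an uninformative classifier that outputs 0 everywhere.
risk_flat = nnpu_risk([0.0, 0.0], [0.0, 0.0, 0.0], prior=0.3)
print(risk_flat)
```

The clamp is what distinguishes the non-negative variant from the unbiased PU estimator: without it, the estimated negative-class risk can become negative during training, which tends to signal overfitting to the labeled positives.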
Xia, R.; Ahn, L.; Burkhauser, M.; Youngs, R.; Bertin, M. J.
Cyanobacterial harmful algal blooms (cyanoHABs) are a major ecological and public health concern, commonly monitored for hepatotoxic microcystins and cylindrospermopsins and neurotoxic anatoxins and saxitoxins. However, the broader suite of bioactive metabolites produced during blooms remains undercharacterized. Here, we interrogated a chromatography fraction library generated from a cyanoHAB in Muskegon, Michigan. From this library, we isolated two new micropeptins (1 and 2), including an analog bearing a bishomologated tyrosine residue, and we confirmed the structure of ferintoic acid C (3). Structures were established using complementary spectrometric and spectroscopic methods. To expand chemical space coverage beyond isolated compounds, we analyzed LC-MS/MS data using the GNPS2 Analysis Hub query language for product ion searching, enabling annotation of cyanopeptide classes and class-specific modifications across the fraction set, which provided a practical and user-friendly approach for identifying cyanopeptide classes. One of the new micropeptins (1) exhibited moderate inhibition of neutrophil elastase, consistent with roles in ecological interactions and potential relevance to human exposure. Analysis of field samples from ongoing Lake Erie blooms showed recurring micropeptins but no evidence of microcystins. Together, these results challenge microcystin-centric assessments of bloom hazard and support expanded monitoring of non-microcystin cyanopeptides. SYNOPSIS: Routine cyanoHAB monitoring targets few regulated toxins; we reveal abundant, undercharacterized cyanopeptides and enable rapid class-level annotation across datasets with a new LC-MS/MS analysis pipeline.
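The idea behind product ion searching, as described in the abstract above, can be sketched in plain Python: flag every MS/MS scan that contains a peak near a chosen diagnostic m/z. This is not the GNPS2 query language itself, and the function names, target m/z, tolerances, and peak lists below are hypothetical placeholders rather than values from the study.

```python
def has_product_ion(peaks, target_mz, tol_mz=0.01, min_rel_intensity=0.05):
    """Return True if any peak lies within tol_mz of target_mz and reaches
    min_rel_intensity relative to the most intense peak in the scan.

    peaks is a list of (mz, intensity) tuples for one MS/MS scan.
    """
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return False
    return any(
        abs(mz - target_mz) <= tol_mz and intensity / base >= min_rel_intensity
        for mz, intensity in peaks
    )


def search_scans(scans, target_mz, **kwargs):
    """Return the IDs of scans that contain the target product ion."""
    return [
        scan_id
        for scan_id, peaks in scans.items()
        if has_product_ion(peaks, target_mz, **kwargs)
    ]


# Hypothetical scans and a hypothetical diagnostic ion at m/z 150.091.
scans = {
    "scan_1": [(150.0912, 800.0), (420.2, 1000.0)],  # match
    "scan_2": [(150.2, 900.0), (300.1, 1000.0)],     # outside m/z tolerance
    "scan_3": [(150.0915, 10.0), (500.3, 1000.0)],   # too weak
}
hits = search_scans(scans, target_mz=150.091)
print(hits)
```

Requiring a minimum relative intensity alongside the m/z tolerance is one common way to keep noise peaks from triggering class-level annotations; suitable thresholds depend on the instrument and the compound class.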